Optimal haplotype block-free selection of tagging SNPs for genome-wide association studies.

Authors

  • Bjarni V Halldórsson
  • Vineet Bafna
  • Ross Lippert
  • Russell Schwartz
  • Francisco M De La Vega
  • Andrew G Clark
  • Sorin Istrail
Abstract

It is widely hoped that the study of sequence variation in the human genome will provide a means of elucidating the genetic component of complex diseases and variable drug responses. A major stumbling block to the successful design and execution of genome-wide disease association studies using single-nucleotide polymorphisms (SNPs) and linkage disequilibrium is the enormous number of SNPs in the human genome. This results in unacceptably high costs for exhaustive genotyping and presents a challenging problem of statistical inference. Here, we present a new method for optimally selecting minimum informative subsets of SNPs, also known as "tagging" SNPs, that is efficient for genome-wide selection. We contrast this method to published methods including haplotype block tagging, that is, grouping SNPs into segments of low haplotype diversity and typing a subset of the SNPs that can discriminate all common haplotypes within the blocks. Because our method does not rely on a predefined haplotype block structure and makes use of the weaker correlations that occur across neighboring blocks, it can be effectively applied across chromosomal regions with both high and low local linkage disequilibrium. We show that the number of tagging SNPs selected is substantially smaller than previously reported using block-based approaches and that selecting tagging SNPs optimally can result in a two- to threefold savings over selecting random SNPs.
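The abstract does not spell out the selection algorithm, so the following is only an illustrative sketch of the block-free idea, not the authors' optimal method. It swaps in a simple greedy set-cover heuristic: each SNP may "tag" any SNP within a sliding neighborhood whose pairwise r^2 meets a threshold, with no block boundaries imposed. The function names, the `window` and `threshold` parameters, and the 0/1 haplotype encoding are all assumptions made for this example.

```python
from typing import List, Sequence


def r_squared(a: Sequence[int], b: Sequence[int]) -> float:
    """Squared correlation (r^2) between two biallelic SNPs coded 0/1."""
    n = len(a)
    pa = sum(a) / n
    pb = sum(b) / n
    if pa in (0.0, 1.0) or pb in (0.0, 1.0):
        return 0.0  # monomorphic SNP: r^2 is undefined, treat as uninformative
    pab = sum(x * y for x, y in zip(a, b)) / n
    d = pab - pa * pb  # linkage disequilibrium coefficient D
    return d * d / (pa * (1 - pa) * pb * (1 - pb))


def greedy_tag_snps(
    haplotypes: Sequence[Sequence[int]],
    window: int = 10,       # hypothetical neighborhood size, in SNPs
    threshold: float = 0.8  # hypothetical r^2 cutoff for "tagged"
) -> List[int]:
    """Greedy (not optimal) tag SNP selection without predefined blocks."""
    n_snps = len(haplotypes[0])
    columns = [[h[j] for h in haplotypes] for j in range(n_snps)]

    # covers[i]: SNPs within the window that SNP i tags, including itself.
    covers = []
    for i in range(n_snps):
        lo, hi = max(0, i - window), min(n_snps, i + window + 1)
        covers.append({
            j for j in range(lo, hi)
            if j == i or r_squared(columns[i], columns[j]) >= threshold
        })

    uncovered = set(range(n_snps))
    tags: List[int] = []
    while uncovered:
        # Pick the SNP tagging the most still-uncovered SNPs;
        # ties are broken in favor of the lowest index.
        best = max(range(n_snps),
                   key=lambda i: (len(covers[i] & uncovered), -i))
        tags.append(best)
        uncovered -= covers[best]
    return sorted(tags)


# Rows are haplotypes, columns are SNPs. SNPs 0, 1 and 2 are perfectly
# correlated with each other; SNP 3 is independent of them.
haps = [
    [0, 0, 1, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 1],
]
tags = greedy_tag_snps(haps, window=3, threshold=0.8)
```

Because the neighborhood slides with each SNP rather than stopping at a block edge, correlations that span what a block-based method would treat as a boundary can still be used. A greedy cover carries no optimality guarantee, which is the main respect in which this sketch differs from the method described in the abstract.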

Similar resources

Genome-Wide Tagging SNPs with Entropy-Based Monte Carlo Method

The number of common single nucleotide polymorphisms (SNPs) in the human genome is estimated to be around 3-6 million. It is highly anticipated that the study of SNPs will help provide a means for elucidating the genetic component of complex diseases and variable drug responses. High-throughput technologies such as oligonucleotide arrays have produced an enormous amount of SNP data, which creates ...

A sparse marker extension tree algorithm for selecting the best set of haplotype tagging single nucleotide polymorphisms.

Single nucleotide polymorphisms (SNPs) play a central role in the identification of susceptibility genes for common diseases. Recent empirical studies on the human genome have revealed block-like structures, and each block contains a set of haplotype tagging SNPs (htSNPs) that capture a large fraction of the haplotype diversity. Herein, we present an innovative sparse marker extension tree (SMET) a...

Haplotype Block Partitioning and tagSNP Selection under the Perfect Phylogeny Model

Single nucleotide polymorphisms (SNPs) are the most common form of polymorphism in the human genome. Analyses of genetic variations have revealed that individual genomes share common SNP haplotypes. The particular pattern of these common variations forms a block-like structure on the human genome. In this work, we develop a new method based on the Perfect Phylogeny Model to identify haplo...

Efficient Haplotype Block Partitioning and Tag SNP Selection Algorithms under Various Constraints

Patterns of linkage disequilibrium play a central role in genome-wide association studies aimed at identifying genetic variation responsible for common human diseases. These patterns in human chromosomes show a block-like structure, and regions of high linkage disequilibrium are called haplotype blocks. A small subset of SNPs, called tag SNPs, is sufficient to capture the haplotype patterns in...

Haplotype block partitioning and tag SNP selection using genotype data and their applications to association studies.

Recent studies have revealed that linkage disequilibrium (LD) patterns vary across the human genome with some regions of high LD interspersed by regions of low LD. A small fraction of SNPs (tag SNPs) is sufficient to capture most of the haplotype structure of the human genome. In this paper, we develop a method to partition haplotypes into blocks and to identify tag SNPs based on genotype data ...

Journal:
  • Genome Research

Volume 14, Issue 8

Pages: -

Publication date: 2004